Heat Map

Word Clouding

Word Clouding of Crime Types: The top number of crimes appeared in 5 years


After reviewing the number trends and density of crime, our group also want to dig further about the primary type of crime with high incidence in Chicago. As a result, we decide to make a word clouding to give greater prominence to crime type that appear more frequently. According to this figure 3, the most frequently occurring type of crime in Chicago from 2017 to 2021 is narcotics, followed by weapons violation, theft, battery, assault, other offense, and criminal trespass etc.

---
title: "Exploratory Analysis Dashboard"
output: 
  flexdashboard::flex_dashboard:
    social: menu
    source: embed
    theme: flatly
---

```{r setup}
library(tidyverse)
library(leaflet)
library(lubridate)
library(ggExtra)
library(plotly)
library(maps)
library(mapdata)
library(ggthemes)
library(mapproj)
library(ggthemes)
library(gganimate)
library(viridis)
library(wordcloud)
library(RColorBrewer)
library(tm)
library(dplyr)
library(scales)
library(gganimate)
theme_set(theme_minimal() + theme(legend.position = "bottom"))
```

```{r}
crime_full=read.csv("/Users/yitian/Desktop/p8105_final_project/data/five_year_data.csv")
```


Trends in Time{.storyboard}
=========================================
### The Number of Crimes Monthly Trends Over 5 Years

```{r, warning=FALSE, message=FALSE}
count_overtime = 
  crime_full %>% 
  group_by(year,month) %>% 
  summarize(cases = n())
count_overtime %>%
  plot_ly(x = ~month, y = ~cases, color=~year, type = "scatter",mode = "lines+markers")  %>% 
  layout(
    title = "Figure 1: The Number of Crime Monthly Trends From 2017 to 2021 in Chicago.",
    xaxis = list(title = "Month"),
    yaxis = list(title = "Total Crime Cases")
  )
```

***
This figure shows how the number of crime cases in Chicago changed monthly over the years. According to this graph, 2017, 2018, and 2019 show a similar trend in the number of crimes. The number of crimes per year is between 3,119 and 4,574 from 2017 to 2021. However, the graph presents a significant trend difference in 2020 and 2021. The number of crimes in 2020 stay in the same trend as previous years’ in January and February but drop immediately in March. The number of crimes reached the lowest point in April 2020, with only 1,119 crime cases this month. The number of crimes in 2021 seems lower than in 2020 in this graph. Since the crime case data of 2021 is only to the present, the database does not include crimes that happened in December 2021. Considering the start time of the COVID-19 pandemic coincide with the drop in the number of crimes, the rapid decrease of crime cases from March 2020 is probably associated with the prevalence of COVID-19.

Heat Map{.storyboard}
=========================================
### The Monthly Density Trends of Crimes Over 5 Years in Chicago

```{r, message=FALSE}
heatmap_plot = crime_full %>%
  mutate(month = month.abb[as.numeric(month)],
         month = fct_rev(factor(month, levels = month.abb))) %>% 
  group_by(year, month, day) %>% 
  summarise(n_crimes = n()) %>% 
  mutate(day = as.numeric(day))
 
heatmap_plot = heatmap_plot %>% 
  ggplot(aes(x = day, y = month, fill = n_crimes))+
  geom_tile(color = "white",size = 0.1) + 
  scale_fill_viridis(name = "Number of Crimes",option = "C") + 
  facet_grid(.~ year) +
  scale_x_continuous(breaks = c(1,10,20,31)) + 
  theme_minimal(base_size = 8) + 
  labs(title = "Figure 2: The Monthly Density Trends of Crimes from 2017 to 2021 in Chicago", x = "Day", y = "Month") + 
  theme(legend.position = "bottom")+
  theme(plot.title = element_text(size = 14))+
  theme(axis.text.y = element_text(size = 6)) +
  theme(strip.background = element_rect(colour = "white"))+
  theme(plot.title = element_text(hjust = 0))+
  theme(axis.ticks = element_blank())+
  theme(axis.text = element_text(size = 7))+
  theme(legend.title = element_text(size = 8))+
  theme(legend.text = element_text(size = 6))+
  removeGrid()
ggplotly(heatmap_plot + theme(legend.position = "none"))
```

***
This heat map gives a better visualization for the monthly density trend of crimes from 2017 to 2021 in Chicago. It clearly shows that in 2017, 2018, 2019, and the first two months in 2020, there is a much higher density of crimes happened during the year. The crime case density drops rapidly in the middle of March in 2020 and reaches the lowest point in April 2020. Comparing the density of crimes in 2017, 2018, and 2019, we can see that crimes are more likely to happen in the middle of the year (from May to September) and are less likely to happen at the beginning and the end of the year. This graph also clearly shows that, generally, Christmas Day has the lowest crime rate of the year.

Word Clouding{.storyboard}
=========================================
### Word Clouding of Crime Types: The top number of crimes appeared in 5 years

```{r, warning=FALSE, message=FALSE}
crime = crime_full %>%
   group_by(primary_type) %>% 
   summarise(n_crime = n())
   set.seed(555)
   wordcloud(words = crime$primary_type, freq = crime$n_crime, scale = c(3, .8),min.freq = 1,
      max.words=200, random.order=FALSE, rot.per=0.35, 
      colors=brewer.pal(8, "Dark2"))
title( "Figure 3: Wordclouding of the top Number of Crimes Over 5 Years")
```

***
After reviewing the number trends and density of crime, our group also want to dig further about the primary type of crime with high incidence in Chicago. As a result, we decide to make a word clouding to give greater prominence to crime type that appear more frequently.
According to this figure 3, the most frequently occurring type of crime in Chicago from 2017 to 2021 is narcotics, followed by weapons violation, theft, battery, assault, other offense, and criminal trespass etc.

Trends in Location{.storyboard}
=========================================
### Top 10 Locations With High Criminal Incidence Over 5 Years

```{r, warning=FALSE, message=FALSE}
## clean a new dataset with ranking of numbers and adjusted numbers
crime_loc = crime_full %>%
  group_by(month,location_description) %>% 
  summarise(n = n()) %>% 
  mutate(rank = rank(-n),
         Value_rel = n/n[rank == 1],
         Value_lbl = paste0(" ",n)) %>% 
  filter(rank <= 10)

## form multiple static plots
staticplot = ggplot(crime_loc, aes(rank, group = location_description, 
                fill = as.factor(location_description), color = as.factor(location_description))) +
  geom_tile(aes(y = n / 2,
                height = n,
                width = 0.9), alpha = 0.8, color = NA) +
  geom_text(aes(y = 0, label = paste(location_description, " ")), vjust = 0.2, hjust = 1) +
  geom_text(aes(y = n,label = Value_lbl, hjust = 0)) +
  coord_flip(clip = "off", expand = FALSE) +
  scale_x_reverse() +
  scale_fill_viridis_d(option = "B") +
  scale_color_viridis_d(option = "B") +
  theme_minimal() +
  theme(axis.line = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.position = "none",
        panel.background = element_blank(),
        panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.grid.major.x = element_line( size = .1, color = "grey" ),
        panel.grid.minor.x = element_line( size = .1, color = "grey" ),
        plot.title = element_text(size = 20, hjust = 0.5, face = "bold", vjust = 2),
        plot.subtitle = element_text(size = 14, hjust = 0.5, face = "italic", color = "grey"),
        plot.background = element_blank(),
        plot.margin = margin(2,2, 2, 4, "cm"))

## convert static plots to animated ones
anim_plot = staticplot + 
  transition_states(month, transition_length = 4, state_length = 1) +
  ease_aes('sine-in-out') +
  labs(title = 'Figure 4: Number of Crime in Chicago Over 5 Years : [Month: {closest_state}]',
       subtitle = "Top 10 Location")  

animate(anim_plot, 150, fps = 20, width = 780, height = 560)
```

***

In addition, our group also sorted and ranked the data, and compiled the top 10 locations with high criminal incidence from 2017 to 2021 into a dynamic bar graph to get a more in-depth understanding of the crime situation. Through this motion chart, we can visually observe the names of the crime areas, the number of crime incidents, and their changing trends.It is clear that streets and sidewalks are the locations with high crime rates during the five-year period.
It is also interesting to note that these two locations always reach a peak in January, July and August, respectively, over the five-year period, and then drop extremely quickly to a low in December. Meanwhile, crime rates in stores (department stores, mall retail stores and food stores) and restaurants can be relatively high in January and December. The number of cases in stores and restaurants will be much higher in December and January, while crime in apartments and residential areas will be more frequent at other times.